15 research outputs found

    Adaptive Seeding for Gaussian Mixture Models

    We present new initialization methods for the expectation-maximization (EM) algorithm for multivariate Gaussian mixture models. Our methods are adaptations of the well-known k-means++ initialization and the Gonzalez algorithm. With them we aim to close the gap between simple random methods, e.g. uniform sampling, and complex methods that crucially depend on the right choice of hyperparameters. Our extensive experiments on artificial as well as real-world data sets indicate the usefulness of our methods compared to common techniques, e.g. those that apply the original k-means++ or the Gonzalez algorithm directly.
    Comment: This is a preprint of a paper accepted for publication in the Proceedings of the 20th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2016. The final publication is available at link.springer.com (http://link.springer.com/chapter/10.1007/978-3-319-31750-2_24).
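    A minimal sketch of the seeding idea, assuming a k-means++-style sampler picks the initial means that EM is started from; the helper name kmeanspp_seeds, the synthetic data, and the use of scikit-learn's GaussianMixture are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def kmeanspp_seeds(X, k, rng):
    """Each new seed is drawn with probability proportional to its
    squared distance to the nearest already-chosen seed."""
    seeds = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - s) ** 2, axis=1) for s in seeds], axis=0)
        seeds.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(seeds)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.5, size=(100, 2)) for m in (0.0, 3.0, 6.0)])
gmm = GaussianMixture(n_components=3, means_init=kmeanspp_seeds(X, 3, rng)).fit(X)
print(gmm.means_)
```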

    Parameter Estimation of Gaussian Mixture Models Using a Hybrid Method Combining Self-Adaptive Differential Evolution with the EM Algorithm

    In the paper the problem of learning Gaussian mixture models (GMMs) is considered. A new approach, called DE-EM, based on the hybridization of a self-adaptive version of differential evolution (DE) with the classical EM algorithm is described. In this approach, the EM algorithm is run until convergence to fine-tune each solution obtained by the mutation and crossover operators of DE. To avoid problems with parameter representation and infeasible solutions, we use a method in which the covariance matrices are encoded using their Cholesky factorizations. In a simulation study, GMMs were used to cluster synthetic datasets differing in the degree of separation between clusters. The results of the experiments indicate that DE-EM outperforms the standard multiple-restart expectation-maximization algorithm (MREM). For datasets with a high number of features it also outperforms the state-of-the-art random swap EM (RSEM).
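    A minimal sketch of the DE-EM idea, assuming DE's mutation and crossover propose candidate component means and each candidate is fine-tuned by EM before greedy selection. For brevity only the means are evolved here (the paper also encodes covariances via their Cholesky factors, as sketched under the next entry); all names and constants are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def de_em(X, k, pop_size=10, gens=20, F=0.7, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]

    def refine(genes):
        """EM fine-tuning: fit a GMM whose means start at the candidate's genes."""
        gmm = GaussianMixture(n_components=k, means_init=genes.reshape(k, d),
                              max_iter=200).fit(X)
        return gmm.means_.ravel(), gmm.score(X)  # refined genes, mean log-likelihood

    # initial population: k data points per individual, flattened to k*d genes
    pop = rng.choice(X, size=(pop_size, k)).reshape(pop_size, k * d)
    refined = [refine(ind) for ind in pop]
    pop = np.array([g for g, _ in refined])
    fit = np.array([f for _, f in refined])

    for _ in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            # DE/rand/1 mutation with binomial crossover, then EM refinement
            trial = np.where(rng.random(k * d) < CR, a + F * (b - c), pop[i])
            genes, ll = refine(trial)
            if ll > fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = genes, ll
    return pop[np.argmax(fit)].reshape(k, d)
```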

    Learning Finite Gaussian Mixture Models Using a Differential Evolution Algorithm

    In the paper the problem of parameter estimation of finite mixtures of multivariate Gaussian distributions is considered. A new approach based on the differential evolution (DE) algorithm is proposed. In order to avoid problems with the infeasibility of chromosomes, our version of DE uses a novel representation in which covariance matrices are encoded using their Cholesky decomposition. The numerical experiments involved three versions of DE differing in the method of selecting the strategy parameters. The results of the experiments, performed on two synthetic datasets and one real dataset, indicate that our method is able to correctly identify the parameters of the mixture model. The method is also able to obtain better solutions than the classical EM algorithm.
    Keywords: Gaussian mixtures, differential evolution, EM algorithm.
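    A minimal sketch of the Cholesky encoding described above: a chromosome stores the entries of the lower-triangular factor L, and the decoded matrix L Lᵀ is a valid covariance matrix for any real-valued genes, so mutation and crossover cannot produce infeasible solutions. The function names are illustrative.

```python
import numpy as np

def encode(cov):
    """Covariance matrix -> flat vector of its lower-triangular Cholesky factor."""
    L = np.linalg.cholesky(cov)
    return L[np.tril_indices_from(L)]

def decode(genes, d):
    """Flat gene vector -> valid covariance matrix, for any real-valued genes."""
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = genes
    return L @ L.T

cov = np.array([[2.0, 0.6], [0.6, 1.0]])
genes = encode(cov)
mutated = genes + 0.1 * np.random.default_rng(0).standard_normal(genes.shape)
print(decode(mutated, 2))  # still symmetric positive semi-definite
```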

    Training Neural Networks with a Hybrid Algorithm Based on Differential Evolution

    A new hybrid method for feed-forward neural network training, which combines the differential evolution algorithm with a gradient-based approach, is proposed. In the method, after each generation of differential evolution, a number of iterations of the conjugate gradient optimization algorithm are applied to each new solution created by the mutation and crossover operators. The experimental results show that, in comparison to standard differential evolution, the hybrid algorithm converges faster. Although this convergence is slower than that of classical gradient-based methods, the hybrid algorithm is significantly better at avoiding local optima.
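    A minimal sketch of the hybrid scheme, assuming SciPy's conjugate-gradient optimizer stands in for the paper's gradient step: after each DE generation, every trial weight vector receives a few CG iterations before selection. The tiny 2-4-1 tanh network, the XOR data, and all constants are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from scipy.optimize import minimize

# XOR as a toy training set for a 2-4-1 tanh network
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])
H = 4
DIM = 2 * H + H + H + 1  # W1, b1, W2, b2 flattened into one vector

def loss(w):
    """Mean squared error of the network with flattened weights w."""
    W1 = w[:2 * H].reshape(2, H)
    b1 = w[2 * H:3 * H]
    W2 = w[3 * H:4 * H]
    b2 = w[-1]
    out = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((out - y) ** 2)

rng = np.random.default_rng(0)
pop = rng.normal(0.0, 1.0, (20, DIM))
fit = np.array([loss(w) for w in pop])
for _ in range(30):  # DE generations
    for i in range(len(pop)):
        others = [j for j in range(len(pop)) if j != i]
        a, b, c = pop[rng.choice(others, 3, replace=False)]
        trial = np.where(rng.random(DIM) < 0.9, a + 0.7 * (b - c), pop[i])
        # hybrid step: a handful of conjugate-gradient iterations per offspring
        res = minimize(loss, trial, method="CG", options={"maxiter": 5})
        if res.fun < fit[i]:
            pop[i], fit[i] = res.x, res.fun
print("best MSE:", fit.min())
```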

    Learning decision rules using a distributed evolutionary algorithm

    A new parallel method for learning decision rules from databases by using an evolutionary algorithm is proposed. We describe an implementation of the EDRL-MD system on a cluster of multiprocessor machines connected by Fast Ethernet. Our approach consists in distributing the learning set across the processors of the cluster. The evolutionary algorithm uses a master-slave model to compute the fitness function in parallel; the remainder of the evolutionary algorithm is executed on the master node. The experimental results show that for large datasets our approach is able to obtain a significant speed-up in comparison to a single-processor version.
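    A minimal sketch of the master-slave fitness step, assuming Python's multiprocessing stands in for the Fast Ethernet cluster: the learning set is split into shards, each worker computes a partial fitness for a candidate rule on its shard, and the master sums the parts while running the rest of the algorithm itself. The box-shaped rule encoding and all names are illustrative assumptions, not EDRL-MD's actual representation.

```python
import numpy as np
from multiprocessing import Pool

def partial_fitness(args):
    """Slave task: covered positives minus covered negatives on one data shard."""
    (lo, hi), Xs, ys = args
    covered = np.all((Xs >= lo) & (Xs <= hi), axis=1)
    return int(np.sum(ys[covered] == 1) - np.sum(ys[covered] == 0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((10_000, 3))
    y = (X[:, 0] > 0.5).astype(int)
    # one shard per slave; on a real cluster each node would hold its shard locally
    shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))
    rule = (np.array([0.5, 0.0, 0.0]), np.array([1.0, 1.0, 1.0]))  # box rule
    with Pool(4) as pool:  # master-slave fitness evaluation
        parts = pool.map(partial_fitness, [(rule, Xs, ys) for Xs, ys in shards])
    print("fitness:", sum(parts))  # selection, mutation etc. stay on the master
```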

    Cost-Sensitive Decision Trees with Pre-pruning


    An Evolutionary Algorithm Using Multivariate Discretization for Decision Rule Induction

    We describe EDRL-MD, an evolutionary-algorithm-based system for learning decision rules from databases. The main novelty of our approach lies in its handling of continuous-valued attributes. Most decision rule learners use univariate discretization methods, which search for threshold values for one attribute at a time. In contrast, EDRL-MD simultaneously searches for threshold values for all continuous-valued attributes when inducing decision rules. We call this approach multivariate discretization. Since multivariate discretization is able to capture interdependencies between attributes, it may improve the accuracy of the obtained rules. The evolutionary algorithm uses problem-specific operators and variable-length chromosomes, which allows it to search for complete rulesets rather than single rules. Preliminary results of experiments on several real-life datasets are presented.
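    A minimal sketch of the multivariate-discretization idea: one chromosome carries threshold values for all continuous attributes at once, so the search adapts them jointly rather than discretizing each attribute in isolation. The simple (1+1)-style mutation loop below is an illustrative stand-in for EDRL-MD's evolutionary algorithm and variable-length rulesets.

```python
import numpy as np

rng = np.random.default_rng(0)
# two interdependent attributes: class 1 iff x0 + x1 > 1
X = rng.random((1000, 2))
y = (X.sum(axis=1) > 1.0).astype(int)

def accuracy(chrom):
    """Chromosome = (lo0, lo1, hi0, hi1); the rule predicts class 1 inside the box."""
    lo, hi = chrom[:2], chrom[2:]
    pred = np.all((X >= lo) & (X <= hi), axis=1).astype(int)
    return np.mean(pred == y)

best = rng.random(4)
for _ in range(2000):
    cand = best + rng.normal(0.0, 0.05, 4)  # mutate all thresholds at once
    if accuracy(cand) > accuracy(best):
        best = cand
print("accuracy:", round(accuracy(best), 3), "thresholds:", best.round(2))
```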